Skip to content

Mekhanik evgenii/fix 2284#2654

Open
EvgeniiMekhanik wants to merge 9 commits into
masterfrom
MekhanikEvgenii/fix-2284
Open

Mekhanik evgenii/fix 2284#2654
EvgeniiMekhanik wants to merge 9 commits into
masterfrom
MekhanikEvgenii/fix-2284

Conversation

@EvgeniiMekhanik

Copy link
Copy Markdown
Contributor

No description provided.

@EvgeniiMekhanik EvgeniiMekhanik marked this pull request as draft May 6, 2026 13:51
@EvgeniiMekhanik EvgeniiMekhanik force-pushed the MekhanikEvgenii/fix-2284 branch 7 times, most recently from 5e1c6c8 to 71e4ce0 Compare May 12, 2026 16:51
@EvgeniiMekhanik EvgeniiMekhanik marked this pull request as ready for review May 12, 2026 16:51
@EvgeniiMekhanik EvgeniiMekhanik force-pushed the MekhanikEvgenii/fix-2284 branch 2 times, most recently from 33e779d to 186a89f Compare May 13, 2026 07:04
@EvgeniiMekhanik EvgeniiMekhanik force-pushed the MekhanikEvgenii/fix-2284 branch from cefbf32 to 1ebe940 Compare May 18, 2026 09:22
@EvgeniiMekhanik EvgeniiMekhanik marked this pull request as draft May 19, 2026 03:17
@EvgeniiMekhanik EvgeniiMekhanik force-pushed the MekhanikEvgenii/fix-2284 branch 8 times, most recently from b89b90b to 4d303b8 Compare May 22, 2026 16:26
@EvgeniiMekhanik EvgeniiMekhanik marked this pull request as ready for review May 22, 2026 16:27
@EvgeniiMekhanik EvgeniiMekhanik force-pushed the MekhanikEvgenii/fix-2284 branch 2 times, most recently from 5ef34b1 to 8915a2b Compare May 25, 2026 12:48
Comment thread fw/connection.h Outdated
Comment thread fw/sock.c
Comment thread fw/sock.c
Comment thread fw/sock.c Outdated
Comment thread fw/ss_skb.c Outdated
Comment thread fw/http.c
@EvgeniiMekhanik EvgeniiMekhanik force-pushed the MekhanikEvgenii/fix-2284 branch 3 times, most recently from e192eeb to 2b6fbc9 Compare June 1, 2026 11:16
@EvgeniiMekhanik EvgeniiMekhanik requested a review from const-t June 1, 2026 11:17
@EvgeniiMekhanik EvgeniiMekhanik marked this pull request as draft June 2, 2026 09:45
@EvgeniiMekhanik EvgeniiMekhanik force-pushed the MekhanikEvgenii/fix-2284 branch 2 times, most recently from 0ac79f5 to 7a53317 Compare June 2, 2026 16:58
@EvgeniiMekhanik EvgeniiMekhanik marked this pull request as ready for review June 2, 2026 17:19
@EvgeniiMekhanik EvgeniiMekhanik force-pushed the MekhanikEvgenii/fix-2284 branch from 7a53317 to 1f9c14d Compare June 3, 2026 06:54

@const-t const-t left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but I left small suggestion.

Comment thread fw/http.c Outdated
@EvgeniiMekhanik EvgeniiMekhanik marked this pull request as draft June 8, 2026 09:38
@EvgeniiMekhanik EvgeniiMekhanik force-pushed the MekhanikEvgenii/fix-2284 branch from 55cef6f to a5f86f7 Compare June 8, 2026 11:35
@EvgeniiMekhanik EvgeniiMekhanik marked this pull request as ready for review June 8, 2026 12:24
@EvgeniiMekhanik EvgeniiMekhanik force-pushed the MekhanikEvgenii/fix-2284 branch from a5f86f7 to 1d9a1d1 Compare June 8, 2026 13:12

@krizhanovsky krizhanovsky left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The pull request is probably fine, I reviewed only a couple of patches from it. The only reason for the reject is that I believe we need to adjust our process for so heavy code modifications. @const-t @EvgeniiMekhanik I want to have a discussion on this, probably on the upcoming meeting.

The main problem is that the whole PR is huge having 9 commits of changes from the skb layer to HTTP, reworked callback architecture and so on. The pull request is an implementation of #2284 feature. However, there is no any single commit, message or comment neither in the code nor in the issue or pull request about how all the 9 commits contribute to the problem solution (only 1d9a1d1 is an exception with a clear purpose).

For example, 1f9c14d :

Rework HTTTP/2 -> HTTP/1 request transformation

why do we need this to solve #2284?

Use pool-based approach for HTTP/2 to HTTP/1 request transformation.

Maybe it's fine

Reuse the same pool-based approach already used for HTTP/1 request forwarding, providing a unified implementation.
Eliminate extra header copying. Previously, headers were written into the linear skb area and then copied by _pskb_copy_fclone. Now headers are written directly into skb fragments, reducing memory usage and copy overhead.

I can review the change and see that the code is correct, doesn't hurt performance and doesn't impact security. However, why the change is required? Which problem it solves? It doesn't clear now and will not clear in a year to review what for this code change was made.

Please, in the pull request write a general design for the #2284 feature and describe each of the commits as the steps reaching the design goal.

Why this is important

  1. The big changes change the project architecture and the whole team must have clear and aligned understanding of the changes. We must review not only the code itself, but the design and architecture.
  2. Mature complex projects always have questions "why this code is written this way?" and the current state of the commits doesn't answer this question.

@ykargin @artem98 FYI - I think you also need to keep this in mind and integrate on your projects.

Comment thread fw/sock.c
}
if (*skb_head && !TFW_SKB_CB(*skb_head)->is_head)
ss_skb_setup_head_of_list(*skb_head, mark, tls_type);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@const-t @EvgeniiMekhanik I had a look onto b68d2ce , which introduced the code and IIRC this also fixed the case when we push a message into a socket and fail somewhere on a middle skb, so a partial message (broken data) is in the write queue. I'm I wrong and is it safe to remove the code now?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously, we considered handling ENOMEM in a special way without tearing down the connection, later we decide to don't do it and immediately drop connection. For example in tfw_tls_encrypt we do not clean up the socket queue on error, potentially leaving broken data behind. This is currently acceptable because the connection is immediately terminated and the entire queue is subsequently cleaned up.

Comment thread fw/ss_skb.c Outdated
@@ -1763,7 +1748,7 @@ void
ss_skb_dflt_destructor(struct sk_buff *skb)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now it's just a regular function, not a destructor and not 'default', any more, so let's rename it to something more appropriate

@EvgeniiMekhanik EvgeniiMekhanik force-pushed the MekhanikEvgenii/fix-2284 branch from 1d9a1d1 to dead18c Compare June 15, 2026 10:13
@EvgeniiMekhanik EvgeniiMekhanik force-pushed the MekhanikEvgenii/fix-2284 branch from dead18c to 405c93e Compare June 25, 2026 18:28
`on_tcp_entail` is always called after `on_send`, so both callbacks
can be combined into a single union, freeing an additional 8 bytes
for callback private data. This private data will be used to store a
pointer to the request, allowing us in the future to retrieve the
request in `on_tcp_entail` from the skb head before it is enqueued into
the socket write queue, in order to decide whether to drop or entail it.

We decided to do so because the `TFW_SKB_CB` structure has very limited
space, and there are no extra 8 bytes available to store a request
pointer. We also cannot use the `opaque_data` pointer in `TFW_SKB_CB`
to store a request pointer and implement a corresponding skb destructor,
because if the server connection is destroyed, the destructor may be
invoked from `ss_do_send -> ss_skb_queue_purge` or
`ss_tx_action -> ss_skb_queue_purge`. In such cases, accessing the
request structure may result in a use-after-free, since the request may
already have been freed during server connection cleanup.

This is a preparatory step toward fixing excessive memory usage caused
by forwarded requests whose client connection has already been closed
but remain in the server’s forwarding queue. These requests should be
dropped before being placed into the server socket’s write queue.

Part-of: release-server-queued-requests
Issue: #2284
- Remove `opaque_data` from `skb->cb`, now use TfwClientMem
  instead. Use only default skb destructor to decrement
  client memory when skb is freed.
- Save http2 response in `on_send_data`. Pass NULL (as a conn)
  in `on_send` callback, if appropriate socket is not active or
  conn is real NULL (previously we don't call `on_send` callback
  in such case). Destroy `on_send_data` directly in `on_send`
  callack if we can't send data (conn is NULL).

This patch simplifies the code and makes it consistent with the previous
changes to the `TFW_SKB_CB` structure, which already provides dedicated
storage for data associated with the `on_send` and `on_tcp_entail`
callbacks. Response objects are now stored in this dedicated storage.
This allows us to apply the same approach to requests, allowing them to
be accessed safely from the `on_send_data` and `on_tcp_entail_data`
callbacks.

Part-of: release-server-queued-requests
Issue: #2284
- Rework `on_tcp_entail` callback invocation. Now it is called
  from `ss_skb_tcp_entail_list` not from `ss_skb_tcp_entail`.
  If this callback fails we drop all skbs from appropriate
  list.
- Implement `tfw_http_req_on_tcp_entail` callback to check
  that client connection for sending request is not closed
  and drop skbs and request if connection is already closed.
- Implement `ss_skb_copy_cb` function for correct copying
  of skb->cb (including all callbacks and destructors).

Previously, requests that had already been placed into a server
connection's `fwd_queue` were still forwarded to the server even
after the originating client connection had been closed. Although
the responses to such requests were no longer needed, the requests
themselves were still fully transmitted to the backend.
This patch changes this behavior by performing an additional client
connection state check immediately before entailing a request's skbs
to the server socket write queue for transmission. If the client
connection has already been closed, the request is discarded together
with all associated skbs instead of being sent to the server.
The chosen approach has a theoretical disadvantage: if a large number
of requests from a reset client connection coexist with many requests
from active client connections and the TCP window is small, requests
belonging to the inactive client connection may remain in the server
connection queue for an extended period of time, consuming memory.
An alternative approach would have been to drop the server connection
and purge requests belonging to inactive client connections from the
server send queue during reconnection. This approach was rejected
because it would also impact active client connections by forcing their
requests to be retransmitted.
Since the server connection is also dropped upon receiving the first
response byte for an inactive client connection (see bac89d1), the
current implementation provides a practical and effective fix for
issue #2284.

Part-of: release-server-queued-requests
Closes #2284
Previously there was a sofisticated logic in
`ss_skb_tcp_entail_list` function, which we are
planning to use to handle ENOMEM in a special way
and do not drop connection. Since we decide to
don't do it, remove this logic at all.
(Made here since we already change `ss_skb_tcp_entail_list`
 in this PR).
- Remove unnecessary `skb_copy_cb` (`__pskb_copy_fclone` already
  do it). Instead of it zero all fields for the cloned skb, which
  are related to client memory accounting (since we set client
  memory for the new cloned skb manually). Zero all other fields
  for original skb, we need appropriate callbacks only for skbs,
  which we will send to the socket write queue.
- Move `ss_skb_on_tcp_entail` under `if (TFW_SKB_CB(skb)->is_head)`
  to show that it should be called only to the head of the list.
- Add comments.
- introduce skb callback container. There are currently many cases
  where different callbacks are required for request/response and
  HTTP/2 frame sending. Since the space available in `skb->cb` is
  limited, introduce a callback container structure and store a
  pointer to it in `skb->cb` instead of embedding callback-specific
  data directly. In addition to improving code clarity and freeing
  space in skb->cb, this change allows an unlimited number of callbacks
  to be added in the future.
Since now we use only default skb destructor for
client memory adjusting, we can remove appropriate
pointer from `skb->cb` and save 8 bytes.
During making frames we adjust skb length to recalculate
`snd_wnd`. It seems when we send http1 data, it's better
to work in the same way. (Made here since we already
change `ss_skb_tcp_entail_list` in this PR).
Use pool-based approach for HTTP/2 to HTTP/1 request transformation.
- Reuse the same pool-based approach already used for HTTP/1 request
  forwarding, providing a unified implementation.
- Eliminate extra header copying. Previously, headers were written into
  the linear skb area and then copied by `_pskb_copy_fclone`. Now headers
  are written directly into skb fragments, reducing memory usage and copy
  overhead.
- do not allocate a new skb during http1 request/response preparing. We
  should remove all old skbs except skb_head. In `skb_head` we can remove
  all fragments and linear data and reuse it, instead of new skb allocation.

This patch demonstrate a very good performance improvement for big requests
without cache:
2284:
finished in 50.04s, 330629.14 req/s, 202.22MB/s
finished in 50.04s, 328837.60 req/s, 201.04MB/s
finished in 50.04s, 329637.96 req/s, 201.61MB/s
master:
finished in 50.04s, 328963.94 req/s, 201.21MB/s
finished in 50.06s, 329731.98 req/s, 201.58MB/s
finished in 50.04s, 328985.10 req/s, 201.17MB/s

2284 2K:
finished in 50.04s, 330025.50 req/s, 201.98MB/s
finished in 50.04s, 329206.90 req/s, 201.32MB/s
finished in 50.04s, 329906.58 req/s, 201.74MB/s
master 2K:
finished in 50.03s, 277741.36 req/s, 169.92MB/s
finished in 50.03s, 294516.36 req/s, 180.08MB/s
finished in 50.03s, 298020.96 req/s, 182.21MB/s

2284 4k:
finished in 50.03s, 313275.10 req/s, 191.66MB/s
finished in 50.03s, 313052.38 req/s, 191.45MB/s
finished in 50.03s, 304237.44 req/s, 186.14MB/s
master 4K:
finished in 50.03s, 261815.08 req/s, 160.15MB/s
finished in 50.03s, 273399.16 req/s, 167.18MB/s
finished in 50.03s, 270743.96 req/s, 165.64MB/s

This patch is also important in the context of fixing issue #2284.
As described earlier, when a large number of requests from a reset
client connection coexist with requests from active client connections
and the TCP window is small, requests belonging to inactive client
connections may remain queued on the server connection for an extended
period of time, consuming memory. This patch significantly reduce this
memory overhead. Previously, request headers were stored in the skb
linear data and were copied during `_pskb_copy_fclone` (doubling memory
consumption!) Request headers are now stored in skb fragments instead,
avoiding the copy in `_pskb_copy_fclone` and reducing memory usage by
approximately 50%.
Remove `ss_skb_dflt_destructor` - this function
is currently used only in one place. Additionally,
the function name does not accurately reflect its
behavior.
@EvgeniiMekhanik EvgeniiMekhanik force-pushed the MekhanikEvgenii/fix-2284 branch from 405c93e to adfa33a Compare June 26, 2026 18:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants